NSF PAR Search | NSF Public Access Repository

Fast Non-Log-Concave Sampling under Nonconvex Equality and Inequality Constraints with Landing

Jeon, Kijung; Muehlebach, Michael; Tao, Molei (September 2025, NeurIPS)

Free, publicly accessible full text available September 18, 2026

Zeroth-Order Optimization Finds Flat Minima

Zhang, Liang; Li, Bingcong; Thekumparampil, Kiran Koshy; Oh, Sewoong; Muehlebach, Michael; He, Niao (June 2025, https://doi.org/10.48550/arXiv.2506.05454)

Zeroth-order methods are extensively used in machine learning applications where gradients are infeasible or expensive to compute, such as black-box attacks, reinforcement learning, and language model fine-tuning. Existing optimization theory focuses on convergence to an arbitrary stationary point, but less is known about the implicit regularization that provides a fine-grained characterization of which particular solutions are reached. This paper shows that zeroth-order optimization with the standard two-point estimator favors solutions with small trace of Hessian, a measure widely used to distinguish between sharp and flat minima. The authors provide convergence rates of zeroth-order optimization to approximate flat minima for convex and sufficiently smooth functions, defining flat minima as minimizers that achieve the smallest trace of Hessian among all optimal solutions. Experiments on binary classification tasks with convex losses and language model fine-tuning support the theoretical findings.
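
For readers unfamiliar with the estimator named in this abstract, the following is a minimal sketch of one common two-point form: a central difference along a random Gaussian direction, plugged into plain gradient descent. The function names, step size, smoothing radius `mu`, and the diagonal quadratic test objective are illustrative assumptions and are not taken from the paper, whose exact estimator and analysis may differ.

```python
import numpy as np


def two_point_estimate(f, x, mu, rng):
    """Two-point (central-difference) zeroth-order gradient estimate.

    Uses only two function evaluations along a random Gaussian direction u.
    This is one common form; it is an assumption, not the paper's definition.
    """
    u = rng.standard_normal(x.shape)
    return (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u


def zo_gradient_descent(f, x0, step=0.01, mu=1e-3, iters=2000, seed=0):
    """Gradient descent in which the gradient is replaced by the estimate."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        x = x - step * two_point_estimate(f, x, mu, rng)
    return x


# Hypothetical convex test problem: f(x) = 0.5 * x^T A x with diagonal A.
A = np.diag(np.arange(1.0, 6.0))


def quadratic(x):
    return 0.5 * x @ A @ x


x0 = np.ones(5)
x_final = zo_gradient_descent(quadratic, x0)
print("initial loss:", quadratic(x0))
print("final loss:  ", quadratic(x_final))
```

The sketch never evaluates a gradient of `quadratic`; each update costs two function evaluations, which is the setting the abstract describes.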
Free, publicly accessible full text available June 5, 2026

On Constraints in First-Order Optimization: A View from Non-Smooth Dynamical Systems

Muehlebach, Michael; Jordan, Michael I (July 2021, arXiv.org)
We introduce a class of first-order methods for smooth constrained optimization that are based on an analogy to non-smooth dynamical systems. Two distinctive features of our approach are that (i) projections or optimizations over the entire feasible set are avoided, in stark contrast to projected gradient methods or the Frank-Wolfe method, and (ii) iterates are allowed to become infeasible, which differs from active set or feasible direction methods, where the descent motion stops as soon as a new constraint is encountered. The resulting algorithmic procedure is simple to implement even when constraints are nonlinear, and is suitable for large-scale constrained optimization problems in which the feasible set fails to have a simple structure. The key underlying idea is that constraints are expressed in terms of velocities instead of positions, which has the algorithmic consequence that optimizations over feasible sets at each iteration are replaced with optimizations over local, sparse convex approximations. The result is a simplified suite of algorithms and an expanded range of possible applications in machine learning.
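
The idea of constraining velocities rather than positions can be illustrated with a small sketch for a single inequality constraint g(x) >= 0. It assumes the following simplified rule, which is an illustration and not necessarily the authors' algorithm: when the constraint is inactive (g(x) > 0) the velocity is the negative gradient; when it is active (g(x) <= 0) the velocity is the point closest to the negative gradient that satisfies the linearized condition grad_g(x)·v + alpha·g(x) >= 0. For one constraint this has a closed form, so no projection onto the feasible set is needed. The objective, the unit-disk constraint, and the parameters `alpha` and `T` are hypothetical choices.

```python
import numpy as np


def constrained_velocity(grad_f, g_val, grad_g, alpha):
    """Velocity closest to -grad_f subject to grad_g . v + alpha * g >= 0.

    The condition is imposed only when the constraint is active (g <= 0).
    Closed form for a single constraint; an illustrative assumption.
    """
    v = -grad_f
    if g_val <= 0.0:
        lam = max(0.0, (grad_g @ grad_f - alpha * g_val) / (grad_g @ grad_g))
        v = v + lam * grad_g
    return v


def run(c, x0, alpha=1.0, T=0.1, iters=500):
    """Minimize 0.5 * ||x - c||^2 subject to g(x) = 1 - ||x||^2 >= 0."""
    x = np.array(x0, dtype=float)
    min_g = np.inf  # most negative constraint value seen along the way
    for _ in range(iters):
        grad_f = x - c
        g_val = 1.0 - x @ x
        grad_g = -2.0 * x
        min_g = min(min_g, g_val)
        x = x + T * constrained_velocity(grad_f, g_val, grad_g, alpha)
    return x, min_g


c = np.array([2.0, 1.0])          # unconstrained minimizer, outside the disk
x_star = c / np.linalg.norm(c)    # closest point of the unit disk to c
x_final, min_g = run(c, x0=np.zeros(2))
print("final iterate:     ", x_final)
print("expected optimum:  ", x_star)
print("min constraint val:", min_g)
```

In this run the iterates briefly leave the unit disk (the minimum constraint value is negative) and are then steered back toward the boundary, which mirrors feature (ii) of the abstract: infeasible iterates are tolerated rather than prevented by projection.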
Full Text Available
